Estimating species phylogenies using coalescence times among sequences.

Authors

  • Liang Liu
  • Lili Yu
  • Dennis K Pearl
  • Scott V Edwards
Abstract

The estimation of species trees (phylogenies) is one of the most important problems in evolutionary biology, and recently, there has been greater appreciation of the need to estimate species trees directly rather than using gene trees as a surrogate. A Bayesian method constructed under the multispecies coalescent model can consistently estimate species trees but involves intensive computation, which can hinder its application to the phylogenetic analysis of large-scale genomic data. Many summary statistics-based approaches, such as shallowest coalescences (SC) and Global LAteSt Split (GLASS), have been developed to infer species phylogenies for multilocus data sets. In this paper, we propose 2 methods, species tree estimation using average ranks of coalescences (STAR) and species tree estimation using average coalescence times (STEAC), based on the summary statistics of coalescence times. It can be shown that the 2 methods are statistically consistent under the multispecies coalescent model. STAR uses the ranks of coalescences and is thus resistant to variable substitution rates along the branches in gene trees. A simulation study suggests that STAR consistently outperforms STEAC, SC, and GLASS when the substitution rates among lineages are highly variable. Two real genomic data sets were analyzed by the 2 methods and produced species trees that are consistent with previous results.
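The rank-based summary statistic mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out the ranking rule, so this example assumes one: the root of each gene tree gets a rank equal to the number of taxa, and the rank drops by one per step toward the tips. The nested-tuple tree encoding and the function names `coalescence_ranks` and `average_ranks` are hypothetical and chosen only for illustration. The sketch also assumes one sequence per species and the same taxa in every gene tree.

```python
from itertools import combinations


def count_leaves(node):
    """Count leaves in a nested-tuple tree whose leaves are strings."""
    if isinstance(node, str):
        return 1
    return sum(count_leaves(child) for child in node)


def coalescence_ranks(tree):
    """Map each pair of taxa to the rank of the node where they coalesce.

    Assumed ranking rule (not stated in the abstract): the root has rank
    n (the number of taxa) and each step toward the tips lowers it by 1.
    """
    n = count_leaves(tree)
    ranks = {}

    def visit(node, depth):
        if isinstance(node, str):
            return [node]
        groups = [visit(child, depth + 1) for child in node]
        # Taxa in different child subtrees coalesce at this node.
        for left, right in combinations(groups, 2):
            for a in left:
                for b in right:
                    ranks[frozenset((a, b))] = n - depth
        return [leaf for group in groups for leaf in group]

    visit(tree, 0)
    return ranks


def average_ranks(gene_trees):
    """Average the pairwise coalescence ranks over all gene trees."""
    totals = {}
    for tree in gene_trees:
        for pair, rank in coalescence_ranks(tree).items():
            totals[pair] = totals.get(pair, 0) + rank
    return {pair: total / len(gene_trees) for pair, total in totals.items()}


if __name__ == "__main__":
    # Three toy gene trees over taxa A, B, C; two of them group A with B.
    gene_trees = [(("A", "B"), "C"), (("A", "B"), "C"), (("A", "C"), "B")]
    for pair, rank in sorted(average_ranks(gene_trees).items(), key=lambda kv: kv[1]):
        print(sorted(pair), round(rank, 3))
```

The averaged ranks form a pairwise distance-like table that a distance-based tree-building method could take as input. Because only topological ranks enter the average, branch lengths in the gene trees play no role, which matches the abstract's point about resistance to variable substitution rates. A STEAC-style variant would average coalescence times instead of ranks and would therefore need gene trees with branch lengths.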

Similar resources

Identifying hybridization events in the presence of coalescence via model selection.

As DNA sequences have become more readily available, it has become increasingly desirable to infer species phylogenies from multigene data sets. Much recent work has centered around the recognition that substantial incongruence in single-gene phylogenies necessitates the development of statistical procedures to estimate species phylogenies that appropriately model the process of evolution at th...

A Six Nuclear Gene Phylogeny of Citrus (Rutaceae) Taking into Account Hybridization and Lineage Sorting

BACKGROUND Genus Citrus (Rutaceae) comprises many important cultivated species that generally hybridize easily. Phylogenetic study of a group showing extensive hybridization is challenging. Since the genus Citrus has diverged recently (4-12 Ma), incomplete lineage sorting of ancestral polymorphisms is also likely to cause discrepancies among genes in phylogenetic inferences. Incongruence of gen...

Estimating Speciation and Extinction Rates Using Phylogenies: Development and Implementation of a Probabilistic Model

Abstract Diversification rate, i.e. the speed at which lineages speciate or go extinct, is one of the most important metrics in ecology and evolutionary biology. Approaches have been developed to estimate rates of speciation and extinction using the molecular phylogenies of extant species (trees describing the evolutionary relationships among species). The general approach consists in deriving th...

Bayes estimation of species divergence times and ancestral population sizes using DNA sequences from multiple loci.

The effective population sizes of ancestral as well as modern species are important parameters in models of population genetics and human evolution. The commonly used method for estimating ancestral population sizes, based on counting mismatches between the species tree and the inferred gene trees, is highly biased as it ignores uncertainties in gene tree reconstruction. In this article, we dev...

Bayesian phylogenetic inference using DNA sequences: a Markov Chain Monte Carlo Method.

An improved Bayesian method is presented for estimating phylogenetic trees using DNA sequence data. The birth-death process with species sampling is used to specify the prior distribution of phylogenies and ancestral speciation times, and the posterior probabilities of phylogenies are used to estimate the maximum posterior probability (MAP) tree. Monte Carlo integration is used to integrate ove...

Journal:
  • Systematic biology

Volume 58, Issue 5

Pages: -

Publication date: 2009